113 research outputs found
On Complexity, Energy- and Implementation-Efficiency of Channel Decoders
Future wireless communication systems require efficient and flexible baseband
receivers. Meaningful efficiency metrics are key for design space exploration
to quantify the algorithmic and the implementation complexity of a receiver.
Most of the current established efficiency metrics are based on counting
operations, thus neglecting important issues like data and storage complexity.
In this paper we introduce suitable energy and area efficiency metrics which
resolve the aforementioned disadvantages. These are decoded information bit
per energy and throughput per area unit. Efficiency metrics are assessed by
various implementations of turbo decoders, LDPC decoders and convolutional
decoders. New exploration methodologies are presented, which permit an
appropriate benchmarking of implementation efficiency, communications
performance, and flexibility trade-offs. These exploration methodologies are
based on efficiency trajectories rather than a single snapshot metric as done
in state-of-the-art approaches.
Comment: Submitted to IEEE Transactions on Communications
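The two metrics named in the abstract above can be illustrated with a minimal sketch. It assumes that throughput counts decoded information bits and that power and area figures come from an implementation report; the function names and the sample numbers are hypothetical and are not taken from the paper.

```python
def energy_efficiency(throughput_bit_s: float, power_w: float) -> float:
    """Decoded information bits per unit energy, in bit/J.

    Throughput (bit/s) divided by power (W = J/s) gives bit/J.
    """
    return throughput_bit_s / power_w


def area_efficiency(throughput_bit_s: float, area_mm2: float) -> float:
    """Throughput per unit area, in (bit/s)/mm^2."""
    return throughput_bit_s / area_mm2


# Hypothetical decoder: 1 Gbit/s information throughput, 0.5 W, 2 mm^2.
bits_per_joule = energy_efficiency(1e9, 0.5)
bits_per_s_per_mm2 = area_efficiency(1e9, 2.0)
```

Evaluating both functions over a range of operating points (for example, different iteration counts) rather than at one point is one way to obtain a trajectory instead of a single snapshot value.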
TensorQuant - A Simulation Toolbox for Deep Neural Network Quantization
Recent research implies that training and inference of deep neural networks
(DNN) can be computed with low precision numerical representations of the
training/test data, weights and gradients without a general loss in accuracy.
The benefit of such compact representations is twofold: they allow a
significant reduction of the communication bottleneck in distributed DNN
training and faster neural network implementations on hardware accelerators
like FPGAs. Several quantization methods have been proposed to map the original
32-bit floating point problem to low-bit representations. While most related
publications validate the proposed approach on a single DNN topology, it
appears evident that the optimal choice of the quantization method and
number of coding bits is topology dependent. To date, there is no general
theory available, which would allow users to derive the optimal quantization
during the design of a DNN topology. In this paper, we present a quantization
toolbox for the TensorFlow framework. TensorQuant allows a transparent
quantization simulation of existing DNN topologies during training and
inference. TensorQuant supports generic quantization methods and allows
experimental evaluation of the impact of the quantization on single layers as
well as on the full topology. In a first series of experiments with
TensorQuant, we show an analysis of fixed-point quantizations of popular CNN
topologies.
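As a rough illustration of what a fixed-point quantization of a single value involves, the sketch below rounds to a signed grid and saturates at the representable range. It is not TensorQuant's API; the function name, the default word length, and the rounding mode (Python's built-in round, which resolves ties to even) are assumptions made for this example.

```python
def quantize_fixed_point(x: float, total_bits: int = 8, frac_bits: int = 4) -> float:
    """Map x to the nearest signed fixed-point value with saturation.

    total_bits: word length including the sign bit (assumed two's complement).
    frac_bits:  number of fractional bits; the step size is 2**-frac_bits.
    """
    scale = 1 << frac_bits
    q_min = -(1 << (total_bits - 1))      # most negative integer code
    q_max = (1 << (total_bits - 1)) - 1   # most positive integer code
    q = round(x * scale)                  # round to the nearest integer code
    q = max(q_min, min(q_max, q))         # saturate instead of wrapping
    return q / scale


weights = [0.30, -1.27, 100.0]
quantized = [quantize_fixed_point(w) for w in weights]
```

With 8 total bits and 4 fractional bits, values are representable in steps of 1/16 between -8.0 and 7.9375, so large inputs such as 100.0 saturate at the upper bound.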
A Reconfigurable Outer Modem Platform for Future Communications Systems
Future mobile and wireless communications networks
require flexible modem architectures with high performance.
Efficient utilization of
application specific flexibility is key to fulfill these
requirements.
For high throughput, a single processor cannot provide
the necessary computational power.
Hence, multi-processor architectures become necessary.
This paper presents a multi-processor platform based on a new
dynamically reconfigurable application specific instruction set processor (dr-ASIP)
for the application domain of channel decoding.
Inherently parallel decoding tasks can be mapped onto individual processing nodes.
The implied challenging inter-processor communication is efficiently handled
by a Network-on-Chip (NoC) such that the throughput of each node is not degraded.
The dr-ASIP features Viterbi and Log-MAP decoding
for support of convolutional and turbo codes
of more than 10 currently specified mobile and wireless standards.
Furthermore, its flexibility allows for adaptation to future systems.
Efficient Hardware Implementation of Constant Time Sampling for HQC
HQC is one of the code-based finalists in the last round of the NIST
post-quantum cryptography standardization process. In this process, security and
implementation efficiency are key metrics for the selection of the candidates.
A critical compute kernel with respect to efficient hardware implementations
and security in HQC is the sampling method used to derive random numbers. Due
to its security criticality, recently an updated sampling algorithm was
presented to increase its robustness against side-channel attacks.
In this paper, we pursue a cross-layer approach to optimize this new sampling
algorithm to enable an efficient hardware implementation without compromising the
original algorithmic security and side-channel attack robustness.
We compare our cross-layer-based implementation to a direct hardware
implementation of the original algorithm and to optimized implementations of
the previous sampler version. All implementations are evaluated using the
Xilinx Artix 7 FPGA. Our results show that our approach reduces the latency by
a factor of 24 compared to the original algorithm and by a factor of 28
compared to the previously used sampler, with significantly fewer resources.
A Hybrid Approach combining ANN-based and Conventional Demapping in Communication for Efficient FPGA-Implementation
In communication systems, Autoencoder (AE) refers to the concept of replacing
parts of the transmitter and receiver by artificial neural networks (ANNs) to
train the system end-to-end over a channel model. This approach aims to improve
communication performance, especially for varying channel conditions, with the
cost of high computational complexity for training and inference.
Field-programmable gate arrays (FPGAs) have been shown to be a suitable
platform for energy-efficient ANN implementation. However, the high number of
operations and the large model size of ANNs limit the performance on
resource-constrained devices, which is critical for low latency and
high-throughput communication systems. To tackle this challenge, we propose a
novel approach for efficient ANN-based demapping on FPGAs, which combines the
adaptability of the AE with the efficiency of conventional demapping
algorithms. After adaptation to channel conditions, the channel characteristics,
implicitly learned by the ANN, are extracted to enable the use of optimized
conventional demapping algorithms for inference. We validate the hardware
efficiency of our approach by providing FPGA implementation results and by
comparing the communication performance to that of conventional systems. Our
work opens a door for the practical application of ANN-based communication
algorithms on FPGAs.
Comment: Available at: https://ieeexplore.ieee.org/document/983569
Advanced Wireless Digital Baseband Signal Processing Beyond 100 Gbit/s
The continuing trend towards higher data rates in wireless communication systems will, in addition to a higher spectral efficiency and lowest signal processing latencies, lead to throughput requirements for the digital baseband signal processing beyond 100 Gbit/s, which is at least one order of magnitude higher than the tens of Gbit/s targeted in the 5G standardization. At the same time, advances in silicon technology due to shrinking feature sizes and increased performance parameters alone won't provide the necessary gain, especially in energy efficiency for wireless transceivers, which have tightly constrained power and energy budgets. In this paper, we highlight the challenges for wireless digital baseband signal processing beyond 100 Gbit/s and the limitations of today's architectures. Our focus lies on channel decoding and MIMO detection, which are major sources of complexity in digital baseband signal processing. We discuss techniques at the algorithmic and architectural levels, which aim to close this gap. For the first time, we show Turbo-Code decoding techniques towards 100 Gbit/s and a complete MIMO receiver beyond 100 Gbit/s in 28 nm technology.
Unsupervised ANN-Based Equalizer and Its Trainable FPGA Implementation
In recent years, communication engineers have put strong emphasis on artificial
neural network (ANN)-based algorithms with the aim of increasing the
flexibility and autonomy of the system and its components. In this context,
unsupervised training is of special interest as it enables adaptation without
the overhead of transmitting pilot symbols. In this work, we present a novel
ANN-based, unsupervised equalizer and its trainable field programmable gate
array (FPGA) implementation. We demonstrate that our custom loss function
allows the ANN to adapt to varying channel conditions, approaching the
performance of a supervised baseline. Furthermore, as a first step towards a
practical communication system, we design an efficient FPGA implementation of
our proposed algorithm, which achieves a throughput on the order of Gbit/s,
outperforming a high-performance GPU by a large margin.
Comment: accepted for publication at Joint European Conference on Networks and
Communications & 6G Summit (EuCNC/6G Summit), Gothenburg, Sweden, 6 - 9 June
202